Recent studies evaluating the bulk microphysical parameterizations (BMPs) within cloud-resolving models (CRMs) have indicated large uncertainties and errors in the amount and size distributions of snow and cloud ice aloft. Snow prediction is sensitive to the snow density, habit, and degree of riming assumed within the BMPs. Improving these BMPs is a crucial step toward improving both weather forecasts and climate predictions. Several microphysical schemes in the Weather Research and Forecasting (WRF) model, run at grid spacings down to 1.33 km, are evaluated using aircraft, radar, and ground in situ data from the Global Precipitation Measurement (GPM) Cold-season Precipitation Experiment (GCPEx), as well as surface measurements of riming, crystal habit, and snow density, together with radar measurements, from 15 winter storms at Stony Brook, NY (SBNY, on the north shore of Long Island) during the 2009-2012 winter seasons. Surface microphysical measurements at SBNY were taken every 15 to 30 minutes using a stereo microscope and camera, and snow depth and snow density were also recorded. During these storms, a vertically pointing Ku-band radar was used to observe the vertical evolution of reflectivity and Doppler vertical velocities. A Particle Size and Velocity (PARSIVEL) disdrometer was also used to measure the surface size distribution and fall speeds of snow at SBNY. For the 15 cases at SBNY, the WRF Single-Moment 6-class (WSM6), Morrison (MORR), Thompson (THOM2), and Stony Brook (SBU-YLIN) BMPs were evaluated. The schemes that assume non-spherical snow (THOM2 and SBU-YLIN) simulated a more realistic distribution of reflectivity than the WSM6 and MORR schemes, which assume spherical snow. During periods of light and moderate riming, the MORR, WSM6, and SBU-YLIN schemes produced velocity distributions comparable to the observed distribution, whereas the THOM2 velocities were approximately 0.25 m/s too slow.
During periods of heavier riming, the vertical Doppler velocities in the WSM6, THOM2, and MORR schemes were approximately 0.25 m/s too slow, while those in the SBU-YLIN scheme were 0.25 to 0.5 m/s too fast. Overall, the BMPs simulated a size distribution close to the observed one for particle diameters D < 4 mm during the dendrite, plate, and mixed-habit periods. The BMPs underestimated the observed size distribution when large aggregates were present: for D > 6 mm during the dendrite, side-plane, and mixed-habit periods, the BMPs are likely not simulating enough aggregation to produce the larger particles, although the double-moment MORR scheme seemed to perform best. These SBNY results will be compared with results from GCPEx for a warm frontal snow band observed on 18 February 2012.